A Random Matrix Theory Perspective on the Consistency of Diffusion Models

Wang, Binxu, Zavatone-Veth, Jacob, Pehlevan, Cengiz

arXiv.org Machine Learning

Diffusion models trained on different, non-overlapping subsets of a dataset often produce strikingly similar outputs when given the same noise seed. We trace this consistency to a simple linear effect: the Gaussian statistics shared across splits already predict much of each generated image. To formalize this, we develop a random matrix theory (RMT) framework that quantifies how finite datasets shape the expectation and variance of the learned denoiser and sampling map in the linear setting. For expectations, sampling variability acts as a renormalization of the noise level through a self-consistent relation σ² ↦ κ(σ²), explaining why limited data overshrinks low-variance directions and pulls samples toward the dataset mean. For fluctuations, our variance formulas reveal three key factors behind cross-split disagreement: anisotropy across eigenmodes, inhomogeneity across inputs, and overall scaling with dataset size. Extending deterministic-equivalence tools to fractional matrix powers further allows us to analyze entire sampling trajectories. The theory sharply predicts the behavior of linear diffusion models, and we validate its predictions on UNet and DiT architectures in their non-memorization regime, identifying where and how samples deviate across training-data splits. This provides a principled baseline for reproducibility in diffusion training, linking spectral properties of data to the stability of generative outputs.
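
To make the renormalization concrete, here is a minimal numerical sketch, assuming the self-consistent relation takes the standard deterministic-equivalence form κ = σ² + (κ/n)·tr(Σ(Σ + κI)⁻¹); the paper's exact relation may differ, and the spectrum and dataset size below are arbitrary choices. The fixed point can be found by simple iteration:

```python
import numpy as np

# Sketch only: solve a self-consistent noise renormalization kappa(sigma^2) of
# the assumed form kappa = sigma^2 + (kappa/n) * tr(Sigma (Sigma + kappa I)^{-1})
# by fixed-point iteration (the map is monotone and bounded, so it converges).
def kappa_of_sigma2(eigs, n, sigma2, n_iter=200):
    kappa = sigma2 + 1.0  # any positive initialization works
    for _ in range(n_iter):
        kappa = sigma2 + (kappa / n) * np.sum(eigs / (eigs + kappa))
    return kappa

# Example: a power-law spectrum with n = 100 training points.
eigs = 1.0 / np.arange(1, 501) ** 1.5
for sigma2 in [0.01, 0.1, 1.0]:
    print(sigma2, kappa_of_sigma2(eigs, n=100, sigma2=sigma2))
```

Since the correction term is non-negative, κ(σ²) ≥ σ² for any spectrum: finite data behaves like sampling at an inflated noise level, which is the mechanism that shrinks low-variance directions toward the dataset mean.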



auto-fpt: Automating Free Probability Theory Calculations for Machine Learning Theory

Subramonian, Arjun, Dohmatob, Elvis

arXiv.org Artificial Intelligence

Much of modern machine learning theory involves computing the high-dimensional expected trace of a rational expression of large rectangular random matrices. To symbolically compute such quantities using free probability theory, we introduce auto-fpt, a lightweight Python and SymPy-based tool that automatically produces a reduced system of fixed-point equations which can be solved for the quantities of interest and effectively constitutes a theory. We give an overview of the algorithmic ideas underlying auto-fpt and its applications to various problems of interest, such as the high-dimensional error of linearized feed-forward neural networks, recovering well-known results. We hope that auto-fpt streamlines the majority of calculations involved in high-dimensional analysis, while helping the machine learning community reproduce known phenomena and uncover new ones.
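
auto-fpt's own interface is not described in this abstract, so the sketch below does not use it. Instead it shows, in plain NumPy with arbitrary dimensions and ridge parameter, the kind of quantity the tool automates: the expected normalized trace of a rational expression in a random matrix, compared against its free-probability prediction, which here reduces to the closed-form Marchenko-Pastur Stieltjes transform:

```python
import numpy as np

# Monte Carlo estimate of (1/p) E tr[(X^T X / n + lam I)^{-1}] for Gaussian X.
rng = np.random.default_rng(0)
n, p, lam = 2000, 1000, 1.0
gamma = p / n
X = rng.standard_normal((n, p))
S = X.T @ X / n
mc = np.trace(np.linalg.inv(S + lam * np.eye(p))) / p

# Free-probability prediction: the Marchenko-Pastur Stieltjes transform m(-lam),
# i.e. the positive root of  gamma*lam*m^2 - (gamma - 1 - lam)*m - 1 = 0.
b = gamma - 1 - lam
m = (b + np.sqrt(b**2 + 4 * gamma * lam)) / (2 * gamma * lam)
print(mc, m)  # agreement up to O(1/p) fluctuations
```

For more complicated rational expressions (products of several independent matrices, resolvents of sums, and so on), no closed form exists in general; the coupled fixed-point equations that replace the single quadratic above are exactly what auto-fpt derives symbolically.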


Two-Point Deterministic Equivalence for Stochastic Gradient Dynamics in Linear Models

Atanasov, Alexander, Bordelon, Blake, Zavatone-Veth, Jacob A., Paquette, Courtney, Pehlevan, Cengiz

arXiv.org Machine Learning

Modern deep learning practice is governed by the surprising predictability of performance improvement with increases in the scale of data, model size, and compute [17]. Often, the scaling of performance as a function of these quantities exhibits remarkably regular power-law behavior, termed a neural scaling law [2, 6, 12, 13, 15, 16, 18, 19, 22, 32]. Here, performance is usually measured by some differentiable loss on the predictions of the model on a held-out test set representative of the population. Given the relatively universal behavior of the exponents across architectures and optimizers [11, 18, 19], one might hope that relatively simple models of information processing systems could recover the same types of scaling laws. The (stochastic) gradient descent (SGD) dynamics of random feature models were analyzed in recent works [7, 20, 26]; these models exhibit a surprising breadth of scaling behavior and capture several interesting phenomena in deep network training. Each of the above works isolated various effects that can hurt performance compared to the idealized infinite-data and infinite-model-size limits. The model was first studied in [7], where the bottlenecks due to finite width and finite dataset size were computed and, for certain data structures, resulted in a Chinchilla-type scaling result as in [18].
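
As a hedged illustration of this class of solvable models (a plain linear-regression simplification with a power-law covariance, not the exact random-feature setup of [7, 20, 26]), online SGD already exhibits approximately power-law decay of the population loss:

```python
import numpy as np

# Toy instance: online SGD on a linear model with a power-law feature spectrum,
# the kind of analytically tractable setting in which scaling-law exponents
# can be computed. All parameter choices here are illustrative.
rng = np.random.default_rng(0)
d = 1000
eigs = 1.0 / np.arange(1, d + 1) ** 1.2          # power-law covariance spectrum
w_star = rng.standard_normal(d) * np.sqrt(eigs)  # target aligned with spectrum
w, lr = np.zeros(d), 0.05

for t in range(1, 30001):
    x = rng.standard_normal(d) * np.sqrt(eigs)   # fresh sample each step: online SGD
    err = x @ (w - w_star)                       # residual on noiseless label
    w -= lr * err * x                            # squared-loss gradient step
    if t in (100, 1000, 10000, 30000):
        loss = 0.5 * np.sum(eigs * (w - w_star) ** 2)  # population risk
        print(t, loss)  # decays roughly as a power law in t
```

The mechanism is visible in the spectrum: modes with eigenvalue above roughly 1/(lr·t) have been learned by step t, so the residual loss is carried by the still-unlearned tail, and a power-law tail yields a power-law loss curve.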


Re-examining Double Descent and Scaling Laws under Norm-based Capacity via Deterministic Equivalence

Wang, Yichen, Chen, Yudong, Rosasco, Lorenzo, Liu, Fanghui

arXiv.org Machine Learning

The number of parameters, i.e., model size, provides a basic measure of the capacity of a machine learning (ML) model. However, it is well known that this count might not describe the effective model capacity (Bartlett, 1998), especially for over-parameterized neural networks (Belkin et al., 2018; Zhang et al., 2021) and large language models (Brown et al., 2020). The focus on the number of parameters results in an inaccurate characterization of the relationship between the test risk R, training data size n, and model size p, which is central in ML to understanding the bias-variance trade-off (Vapnik, 1995), double descent (Belkin et al., 2019), and scaling laws (Kaplan et al., 2020; Xiao, 2024). For example, even for the same architecture (model size), the test error behavior can be totally different (Nakkiran et al., 2020, 2021); e.g., double descent may disappear. Here we shift the focus from model size to the weights and consider their norm, a perspective pioneered by the classical results of Bartlett (1998). Indeed, norm-based capacity/complexity measures are widely considered to be more effective in characterizing generalization behavior.
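
A minimal sketch of why parameter count alone misleads: in min-norm least squares on Gaussian features (an illustrative toy, not the paper's setting), the test risk is non-monotone in p, peaking near the interpolation threshold p = n, and the weight norm spikes at exactly the same point:

```python
import numpy as np

# Double descent in min-norm least squares: sweep the number of features p
# used by the model past the interpolation threshold p = n. Setup is illustrative.
rng = np.random.default_rng(0)
n, d = 100, 400                                # samples, ambient dimension
beta = rng.standard_normal(d) / np.sqrt(d)     # true signal over all d features
X, Xt = rng.standard_normal((n, d)), rng.standard_normal((2000, d))
y = X @ beta + 0.1 * rng.standard_normal(n)
yt = Xt @ beta

for p in [20, 50, 90, 100, 110, 200, 400]:     # model sees only the first p features
    w = np.linalg.pinv(X[:, :p]) @ y           # min-norm interpolant once p >= n
    risk = np.mean((Xt[:, :p] @ w - yt) ** 2)
    print(p, round(risk, 3), round(np.linalg.norm(w), 3))
```

Note that the risk peak and the norm spike coincide: the ill-conditioned design at p ≈ n forces a huge-norm interpolant, which is precisely the observation that motivates tracking the weight norm rather than p.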


Predator Prey Scavenger Model using Holling's Functional Response of Type III and Physics-Informed Deep Neural Networks

Panchal, Aneesh, Beniwal, Kirti, Kumar, Vivek

arXiv.org Artificial Intelligence

Nonlinear mathematical models describe the relations between various physical and biological interactions present in nature. One of the most famous is the Lotka-Volterra model, which describes the interaction between predator and prey species. However, predator, scavenger, and prey populations coexist in natural systems, where scavengers can additionally rely on the carcasses of predators. With this in mind, this paper introduces the formulation and simulation of a predator-prey-scavenger model. For the predation response, the prey species is assumed to follow Holling's type III functional response. The proposed model is tested in various simulations and shows satisfactory results across different scenarios. After the simulations, an American forest dataset is used for parameter estimation, imitating a real-world case. For parameter estimation, a physics-informed deep neural network is trained with the Adam backpropagation method, which prevents an avalanche effect in the updates of the trainable parameters. The network's loss combines a mean-squared-error term and a physics-informed error term. The parameters found by the neural network are then fine-tuned using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. Finally, the parameters estimated from the natural dataset are tested for stability using Jacobian stability analysis. Future work includes minimizing the error induced by the parameters, bifurcation analysis, and sensitivity analysis of the parameters.
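
For concreteness, here is a sketch of a three-species system of this type with a Holling type III response f(x) = x²/(a² + x²); the equations, interaction terms, and parameter values below are illustrative assumptions, not the paper's model:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative predator-prey-scavenger ODEs (assumed form, not the paper's).
a, r, K = 1.0, 1.0, 10.0      # half-saturation, prey growth rate, carrying capacity
c1, d1 = 0.8, 0.4             # predator conversion efficiency / death rate
c2, c3, d2 = 0.3, 0.2, 0.3    # scavenger feeding (prey, carcasses) / death rate

def rhs(t, u):
    x, y, z = u               # prey, predator, scavenger
    fx = x**2 / (a**2 + x**2)                   # Holling type III response
    dx = r * x * (1 - x / K) - fx * y - c2 * x * z
    dy = c1 * fx * y - d1 * y
    dz = c2 * x * z + c3 * d1 * y * z - d2 * z  # scavenger also eats predator carcasses
    return [dx, dy, dz]

sol = solve_ivp(rhs, (0, 200), [5.0, 2.0, 1.0])
print(sol.y[:, -1])           # long-run populations (x, y, z)
```

The type III response fx saturates for abundant prey and is quadratic near zero, so predation pressure vanishes faster than linearly at low prey density, a standard stabilizing feature relative to type II.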


An Effective Theory of Bias Amplification

Subramonian, Arjun, Bell, Samuel J., Sagun, Levent, Dohmatob, Elvis

arXiv.org Machine Learning

Machine learning models may capture and amplify biases present in data, leading to disparate test performance across social groups. Better evaluating and mitigating these possible biases requires a deeper theoretical understanding of how model design choices and data distribution properties contribute to bias. In this work, we contribute a precise analytical theory in the context of ridge regression, both with and without random projections, where the former models neural networks in a simplified regime. Our theory offers a unified and rigorous explanation of machine learning bias, providing insights into phenomena such as bias amplification and minority-group bias in various feature and parameter regimes. For example, we demonstrate that there may be an optimal regularization penalty or training time to avoid bias amplification, and there can be fundamental differences in test error between groups that do not vanish with increased parameterization. Importantly, our theoretical predictions align with several empirical observations reported in the literature.
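
A hedged toy version of the ridge-regression setting (group structure, sizes, and covariances below are illustrative assumptions): sweeping the regularization penalty shows the per-group test errors moving differently, so an intermediate penalty can trade off the two groups rather than either extreme:

```python
import numpy as np

# Ridge regression on a majority/minority mixture, tracking per-group test error
# as a function of the penalty. All distributional choices are illustrative.
rng = np.random.default_rng(0)
d, n_maj, n_min = 50, 450, 50
w = rng.standard_normal(d) / np.sqrt(d)

def sample(n, scale):
    X = scale * rng.standard_normal((n, d))
    return X, X @ w + 0.1 * rng.standard_normal(n)

Xa, ya = sample(n_maj, 1.0)            # majority group
Xb, yb = sample(n_min, 0.5)            # minority group with a different covariance
X, y = np.vstack([Xa, Xb]), np.concatenate([ya, yb])
Xat, yat = sample(2000, 1.0)           # per-group test sets
Xbt, ybt = sample(2000, 0.5)

for lam in [1e-3, 1e-1, 1e1]:
    w_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    err_maj = np.mean((Xat @ w_hat - yat) ** 2)
    err_min = np.mean((Xbt @ w_hat - ybt) ** 2)
    print(lam, round(err_maj, 4), round(err_min, 4))
```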


Strong Model Collapse

Dohmatob, Elvis, Feng, Yunzhen, Subramonian, Arjun, Kempe, Julia

arXiv.org Machine Learning

Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish the existence of a strong form of the model collapse phenomenon, a critical performance degradation due to synthetic data in the training corpus. Our results show that even the smallest fraction of synthetic data (e.g., as little as 1% of the total training dataset) can still lead to model collapse: larger and larger training sets do not enhance performance. We further investigate whether increasing model size, an approach aligned with current trends in training large language models, exacerbates or mitigates model collapse. In a simplified regime where neural networks are approximated via random projections of tunable size, we both theoretically and empirically show that larger models can amplify model collapse. Interestingly, our theory also indicates that, beyond the interpolation threshold (which can be extremely high for very large datasets), larger models may mitigate the collapse, although they do not entirely prevent it. Our theoretical findings are empirically verified through experiments on language models and feed-forward neural networks for images.
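
A minimal sketch of the mechanism in the simplest proxy (ridge regression where a fixed fraction of labels is generated by an imperfect earlier model; all parameters are illustrative assumptions): the clean-data error keeps falling with n, while the partially synthetic run plateaus at a floor set by the synthetic fraction:

```python
import numpy as np

# Toy model collapse: 1% of training labels come from an imperfect "generator"
# w_synth instead of the ground truth w_true. Setup is illustrative only.
rng = np.random.default_rng(0)
d, lam, frac_synth = 100, 1e-3, 0.01
w_true = rng.standard_normal(d) / np.sqrt(d)
w_synth = w_true + rng.standard_normal(d) / np.sqrt(d)  # imperfect earlier model

def test_error(n, synth):
    X = rng.standard_normal((n, d))
    y = X @ w_true + 0.1 * rng.standard_normal(n)
    if synth:
        k = int(frac_synth * n)
        y[:k] = X[:k] @ w_synth          # replace a fraction with synthetic labels
    w_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    Xt = rng.standard_normal((5000, d))
    return np.mean((Xt @ (w_hat - w_true)) ** 2)  # excess test risk

for n in [1000, 10000, 100000]:
    print(n, test_error(n, synth=False), test_error(n, synth=True))
```

The floor arises because the fitted weights are biased toward the generator by roughly frac_synth·(w_synth − w_true), a bias that no amount of additional mixed data removes.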


Risk and cross validation in ridge regression with correlated samples

Atanasov, Alexander, Zavatone-Veth, Jacob A., Pehlevan, Cengiz

arXiv.org Machine Learning

Recent years have seen substantial advances in our understanding of high-dimensional ridge regression, but existing theories assume that training examples are independent. By leveraging recent techniques from random matrix theory and free probability, we provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations. We demonstrate that in this setting, the generalized cross validation (GCV) estimator fails to correctly predict the out-of-sample risk. However, in the case where the noise residuals have the same correlations as the data points, one can modify the GCV to yield an efficiently computable unbiased estimator that concentrates in the high-dimensional limit, which we dub CorrGCV. We further extend our asymptotic analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting. Assuming knowledge of the correlation structure of the time series, this again yields an extension of the GCV estimator, and sharply characterizes the degree to which such test points yield an overly optimistic prediction of long-time risk. We validate the predictions of our theory across a variety of high-dimensional datasets.
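
A hedged sketch of the failure mode (AR(1)-correlated rows as one illustrative correlation structure; the paper's CorrGCV correction is not reproduced here): the classical GCV estimator, derived under i.i.d. sampling, visibly misestimates the true out-of-sample risk once the training rows are correlated:

```python
import numpy as np

# GCV vs. true out-of-sample risk for ridge regression with correlated samples.
rng = np.random.default_rng(0)
n, d, lam, rho = 500, 100, 0.1, 0.9
w = rng.standard_normal(d) / np.sqrt(d)

# AR(1)-correlated design: each row is a noisy drift of the previous one,
# with N(0, I) stationary marginals.
X = np.zeros((n, d))
X[0] = rng.standard_normal(d)
for t in range(1, n):
    X[t] = rho * X[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal(d)
y = X @ w + 0.1 * rng.standard_normal(n)

# Classical GCV: (1/n)||(I - H) y||^2 / (1 - tr(H)/n)^2 with the ridge hat matrix.
H = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T)
gcv = np.mean((y - H @ y) ** 2) / (1 - np.trace(H) / n) ** 2

# True prediction risk on fresh i.i.d. test points (excess risk + noise variance).
w_hat = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)
Xt = rng.standard_normal((5000, d))
risk = np.mean((Xt @ (w_hat - w)) ** 2) + 0.1**2
print(gcv, risk)  # GCV deviates from the true risk under correlated sampling
```

Setting rho = 0 recovers the i.i.d. case, where the two printed numbers agree closely; the gap grows with rho, which is the bias that CorrGCV is designed to remove.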